Stone game [Predict the Winner]
Time: O(N^2); Space: O(N); medium
Alex and Lee play a game with piles of stones. There are an even number of piles arranged in a row, and each pile has a positive integer number of stones piles[i].
The objective of the game is to end with the most stones. The total number of stones is odd, so there are no ties.
Alex and Lee take turns, with Alex starting first. Each turn, a player takes the entire pile of stones from either the beginning or the end of the row. This continues until there are no more piles left, at which point the person with the most stones wins.
Assuming Alex and Lee play optimally, return True if and only if Alex wins the game.
Example 1:
Input: piles = [5,3,4,5]
Output: True
Explanation:
Alex starts first, and can only take the first 5 or the last 5.
Say she takes the first 5, so that the row becomes [3, 4, 5].
If Lee takes 3, then the board is [4, 5], and Alex takes 5 to win with 10 points.
If Lee takes the last 5, then the board is [3, 4], and Alex takes 4 to win with 9 points.
This demonstrates that taking the first 5 is a winning move for Alex, so we return True.
Constraints:
2 <= len(piles) <= 500
len(piles) is even.
1 <= piles[i] <= 500
sum(piles) is odd.
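The rules above can be checked on small inputs with an exhaustive search. The following is a minimal sketch, not one of the solutions in this notebook: `alex_wins_brute_force` is a hypothetical helper name, and the search takes exponential time, so it is only suitable for a handful of piles.

```python
def alex_wins_brute_force(piles):
    """Play out every line of the game; exponential time, small inputs only."""

    def best_totals(i, j):
        # Returns (mover, other): the stones collected from piles[i..j] by the
        # player about to move and by the opponent, when each player
        # maximises their own total.
        if i > j:
            return 0, 0
        # After a move the roles swap, so the recursive call reports the
        # opponent's total first and the current mover's total second.
        opp_left, me_left = best_totals(i + 1, j)
        take_left = (piles[i] + me_left, opp_left)
        opp_right, me_right = best_totals(i, j - 1)
        take_right = (piles[j] + me_right, opp_right)
        # Tuples compare on the mover's total first.
        return max(take_left, take_right)

    alex, lee = best_totals(0, len(piles) - 1)
    return alex > lee


print(alex_wins_brute_force([5, 3, 4, 5]))
```

On Example 1 this search should agree with the expected output, True.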
1. Dynamic Programming [O(N^2), O(N^2)]
Intuition
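Let’s change the game so that whenever Lee scores points, they are deducted from Alex’s score instead. Alex then wins the original game exactly when her final score in the modified game is positive.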
Let dp(i, j) be the largest score Alex can achieve where the piles remaining are piles[i], piles[i+1], …, piles[j]. This is natural in games with scoring: we want to know what the value of each position of the game is.
We can formulate a recursion for dp(i, j) in terms of dp(i+1, j) and dp(i, j-1), and we can use dynamic programming to avoid repeating work in this recursion. (This approach produces the correct answer because the states form a DAG (directed acyclic graph): every state depends only on states with fewer piles remaining.)
Algorithm
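When the piles remaining are piles[i], piles[i+1], …, piles[j], the player whose turn it is has at most 2 moves.
Whose turn it is can be found from the parity of j - i - N: Alex moves exactly when j - i - N is odd. Since N is even, this is the case when an even number of piles remains.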
If the player is Alex, then she either takes piles[i] or piles[j], increasing her score by that amount. Afterwards, the total score is either piles[i] + dp(i+1, j), or piles[j] + dp(i, j-1); and we want the maximum possible score.
If the player is Lee, then he either takes piles[i] or piles[j], decreasing Alex’s score by that amount. Afterwards, the total score is either -piles[i] + dp(i+1, j), or -piles[j] + dp(i, j-1); and we want the minimum possible score.
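The two cases above can be collected into a single recurrence. This is a restatement of the rules just given, written under the assumption that an empty range (i > j) has value 0, which is the base case used in the code that follows:

```latex
\[
dp(i, j) =
\begin{cases}
0, & i > j, \\[4pt]
\max\bigl(\mathrm{piles}[i] + dp(i+1, j),\ \mathrm{piles}[j] + dp(i, j-1)\bigr), & j - i - N \text{ odd (Alex to move)}, \\[4pt]
\min\bigl(-\mathrm{piles}[i] + dp(i+1, j),\ -\mathrm{piles}[j] + dp(i, j-1)\bigr), & j - i - N \text{ even (Lee to move)}.
\end{cases}
\]
```

Alex wins exactly when dp(0, N-1) > 0.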
[16]:
from functools import lru_cache


class Solution1(object):
    """
    Time:  O(N^2), where N is the number of piles
    Space: O(N^2), the space used storing the intermediate results of each subgame
    """
    def stoneGame(self, piles):
        """
        :type piles: List[int]
        :rtype: bool
        """
        N = len(piles)

        @lru_cache(None)
        def dp(i, j):
            # The value of the game [piles[i], piles[i+1], ..., piles[j]].
            if i > j:
                return 0
            parity = (j - i - N) % 2
            if parity == 1:  # first player (Alex) to move
                return max(piles[i] + dp(i+1, j), piles[j] + dp(i, j-1))
            else:  # second player (Lee) to move
                return min(-piles[i] + dp(i+1, j), -piles[j] + dp(i, j-1))

        return dp(0, N - 1) > 0
[17]:
s = Solution1()
piles = [5,3,4,5]
assert s.stoneGame(piles) == True
2. Dynamic Programming [O(N^2), O(N)]
[18]:
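class Solution2(object):
    """
    Time:  O(N^2)
    Space: O(N)
    """
    def stoneGame(self, piles):
        """
        :type piles: List[int]
        :rtype: bool
        """
        # With an even number of piles (or a single pile) the first player
        # never loses, so the answer is known without running the DP.
        if len(piles) % 2 == 0 or len(piles) == 1:
            return True
        # dp[j] holds the best score margin (mover minus opponent) on piles[i..j].
        dp = [0] * len(piles)
        for i in reversed(range(len(piles))):
            dp[i] = piles[i]  # only one pile left: the mover takes it
            for j in range(i+1, len(piles)):
                # On the right-hand side, dp[j] is still the value for piles[i+1..j]
                # and dp[j - 1] is already the value for piles[i..j-1].
                dp[j] = max(piles[i] - dp[j], piles[j] - dp[j - 1])
        # A margin of zero (a tie) is counted as a first-player win here.
        return dp[-1] >= 0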
[19]:
s = Solution2()
piles = [5,3,4,5]
assert s.stoneGame(piles) == True
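Solution2 is given without commentary. One way to read it, offered as an interpretation rather than the author's stated derivation, is as a space-compressed form of a two-dimensional table of score margins, where entry [i][j] is the best margin (stones of the player to move minus stones of the opponent) on piles[i..j]. The sketch below builds that full table; `score_difference_table` is a hypothetical helper name introduced for illustration.

```python
def score_difference_table(piles):
    """Return d where d[i][j] is the best (mover - opponent) margin on piles[i..j]."""
    n = len(piles)
    d = [[0] * n for _ in range(n)]
    for i in reversed(range(n)):
        d[i][i] = piles[i]  # one pile left: the mover simply takes it
        for j in range(i + 1, n):
            # Taking a pile hands the rest of the row to the opponent, whose
            # best margin on that row is subtracted from the mover's gain.
            d[i][j] = max(piles[i] - d[i + 1][j], piles[j] - d[i][j - 1])
    return d


table = score_difference_table([5, 3, 4, 5])
print(table[0][3])  # → 1
```

Row i of this table depends only on row i + 1 and on entries to the left in row i, which is why a single array of length N, overwritten in place, is enough. For Example 1 the margin of 1 corresponds to Alex finishing with 9 stones against Lee's 8.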